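In this paper, we propose GT-GDA, a distributed optimization method for solving saddle point problems of the form $\min_{\mathbf{x}} \max_{\mathbf{y}} \{F(\mathbf{x},\mathbf{y}) := G(\mathbf{x}) + \langle \mathbf{y}, \overline{P} \mathbf{x} \rangle - H(\mathbf{y})\}$, where the functions $G(\cdot)$ and $H(\cdot)$ and the coupling matrix $\overline{P}$ are distributed over a strongly connected network of nodes. GT-GDA is a first-order method that uses gradient tracking to eliminate the dissimilarity caused by heterogeneous data distributions among the nodes. In its most general form, GT-GDA includes a consensus over the local coupling matrices to reach the optimal (unique) saddle point, at the expense of increased communication. To avoid this, we propose a more efficient variant, GT-GDA-Lite, that does not incur the additional communication, and we analyze its convergence in various scenarios. We show that GT-GDA converges linearly to the unique saddle point solution when $G(\cdot)$ is smooth and convex, $H(\cdot)$ is smooth and strongly convex, and the global coupling matrix $\overline{P}$ has full column rank. We further characterize the regime in which GT-GDA exhibits convergence behavior that is independent of the network topology. Next, we show linear convergence of GT-GDA to an error around the unique saddle point; this error goes to zero when the coupling cost $\langle \mathbf{y}, \overline{P} \mathbf{x} \rangle$ is common to all nodes, or when $G(\cdot)$ and $H(\cdot)$ are quadratic. Numerical experiments illustrate the convergence properties and the importance of GT-GDA and GT-GDA-Lite for several applications.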
Translated by Google Translate
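To make the saddle point structure concrete, the following is a minimal sketch of plain, centralized gradient descent ascent (GDA) on a small quadratic instance of $F(\mathbf{x},\mathbf{y}) = G(\mathbf{x}) + \langle \mathbf{y}, \overline{P}\mathbf{x} \rangle - H(\mathbf{y})$. It is not GT-GDA: there is no network, no gradient tracking, and no consensus step. The choices $G(\mathbf{x}) = \tfrac{1}{2}\|\mathbf{x}\|^2 + \mathbf{c}^\top\mathbf{x}$, $H(\mathbf{y}) = \tfrac{1}{2}\|\mathbf{y}\|^2$, the matrix `P`, the vector `c`, the step size, and all function names are assumptions made purely for illustration.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]


def gda(P, c, step, iters):
    """Centralized GDA on F(x, y) = 0.5*||x||^2 + c.x + <y, P x> - 0.5*||y||^2.

    Illustrative only: this is single-machine GDA, not the distributed
    GT-GDA method described in the abstract.
    """
    Pt = transpose(P)
    x = [0.0] * len(P[0])
    y = [0.0] * len(P)
    for _ in range(iters):
        # Both gradients are evaluated at the current (x, y) before updating.
        grad_x = [xi + ci + pi for xi, ci, pi in zip(x, c, matvec(Pt, y))]
        grad_y = [pi - yi for pi, yi in zip(matvec(P, x), y)]
        x = [xi - step * g for xi, g in zip(x, grad_x)]  # descent in x
        y = [yi + step * g for yi, g in zip(y, grad_y)]  # ascent in y
    return x, y


# Hypothetical 3x2 coupling matrix with full column rank.
P = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
c = [1.0, -1.0]
x_hat, y_hat = gda(P, c, step=0.15, iters=500)

# For this instance the saddle point solves (I + P^T P) x = -c and y = P x.
x_star = [-7.0 / 17.0, 4.0 / 17.0]
y_star = matvec(P, x_star)
```

For this particular instance the iterates approach `x_star` and `y_star` at a linear rate, which mirrors, in a much simpler setting, the kind of linear convergence the abstract claims for GT-GDA.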
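Since Facebook was renamed Meta, attention, debate, and exploration have intensified around what the Metaverse is, how it works, and the possible ways to exploit it. The Metaverse is anticipated to be a continuum of rapidly emerging technologies, use cases, capabilities, and experiences that will make up the next evolution of the Internet. Several researchers have already surveyed the literature on artificial intelligence (AI) and wireless communications in realizing the Metaverse. However, given the rapid emergence of these technologies, there is a need for a comprehensive and in-depth review of the role of AI, 6G, and the nexus of both in realizing Metaverse experiences. Therefore, in this survey, we first introduce the background and ongoing progress in augmented reality (AR), virtual reality (VR), mixed reality (MR), and spatial computing, followed by the technical aspects of AI and 6G. Then, we survey the role of AI in the Metaverse by reviewing the state of the art in deep learning, computer vision, and edge AI. Next, we investigate the promising services of B5G/6G for the Metaverse, and then identify the role of AI in 6G networks and of 6G networks for AI in support of Metaverse applications. Finally, we list existing and potential applications, use cases, and projects to highlight the importance of progress in the Metaverse. Moreover, to provide researchers with potential research directions, we present the challenges, research gaps, and lessons learned identified from the literature review of the aforementioned technologies.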
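The past decade has seen rapid adoption of artificial intelligence (AI), specifically deep learning networks, in the Internet of Medical Things (IoMT) ecosystem. However, it has recently been shown that deep learning networks can be exploited by adversarial attacks, which make IoMT vulnerable not only to data theft but also to the manipulation of medical diagnoses. Existing studies consider adding noise to the raw IoMT data or to the model parameters, which not only reduces the overall performance of medical inference but is also ineffective against methods such as deep leakage from gradients. In this work, we propose the proximal gradient split learning (PSGL) method for defense against model inversion attacks. The proposed method intentionally attacks the IoMT data while it undergoes the deep neural network training process on the client side. We propose using the proximal gradient method to recover gradient maps, together with a decision-level fusion strategy to improve recognition performance. Extensive analysis shows that PSGL not only provides an effective defense mechanism against model inversion attacks but also helps improve recognition performance on publicly available datasets. We report accuracy gains of 17.9% and 36.9% over reconstructed and adversarially attacked images, respectively.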
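Partial differential equations (PDEs) play a dominant role in the mathematical modeling of many complex dynamical processes. Solving these PDEs often requires prohibitively high computational costs, especially when multiple evaluations must be made for different parameters or conditions. After training, neural operators can provide PDE solutions significantly faster than traditional PDE solvers. In this work, we examine the invariance properties and computational complexity of two neural operators for the transport PDE of a scalar quantity. The neural operator based on the graph kernel network (GKN) operates on graph-structured data to incorporate nonlocal dependencies. Here we propose a modified formulation of GKN to achieve frame invariance. The vector cloud neural network (VCNN) is an alternative neural operator with embedded frame invariance that operates on point cloud data. The GKN-based neural operator demonstrates slightly better predictive performance than VCNN. However, GKN requires an excessively high computational cost that increases quadratically with the number of discretized objects, compared with a linear increase for VCNN.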
Diabetic Retinopathy (DR) is considered one of the primary concerns due to its effect on vision loss among most people with diabetes globally. The severity of DR is mostly comprehended manually by ophthalmologists from fundus photography-based retina images. This paper deals with an automated understanding of the severity stages of DR. In the literature, researchers have focused on this automation using traditional machine learning-based algorithms and convolutional architectures. However, the past works hardly focused on essential parts of the retinal image to improve the model performance. In this paper, we adopt transformer-based learning models to capture the crucial features of retinal images to understand DR severity better. We work with ensembling image transformers, where we adopt four models, namely ViT (Vision Transformer), BEiT (Bidirectional Encoder representation for image Transformer), CaiT (Class-Attention in Image Transformers), and DeiT (Data efficient image Transformers), to infer the degree of DR severity from fundus photographs. For experiments, we used the publicly available APTOS-2019 blindness detection dataset, where the performances of the transformer-based models were quite encouraging.
While the brain connectivity network can inform the understanding and diagnosis of developmental dyslexia, its cause-effect relationships have not yet been sufficiently examined. Employing electroencephalography signals and a band-limited white noise stimulus at 4.8 Hz (the prosodic-syllabic frequency), we measure the phase Granger causalities among channels to identify differences between dyslexic learners and controls, thereby proposing a method to calculate directional connectivity. As causal relationships run in both directions, we explore three scenarios, namely channels' activity as sources, as sinks, and in total. Our proposed method can be used for both classification and exploratory analysis. In all scenarios, we find confirmation of the established right-lateralized Theta sampling network anomaly, in line with the temporal sampling framework's assumption of oscillatory differences in the Theta and Gamma bands. Further, we show that this anomaly primarily occurs in the causal relationships of channels acting as sinks, where it is significantly more pronounced than when only total activity is observed. In the sink scenario, our classifier obtains accuracies of 0.84 and 0.88 and AUCs of 0.87 and 0.93 for the Theta and Gamma bands, respectively.
This paper presents our solutions for the MediaEval 2022 task on DisasterMM. The task is composed of two subtasks, namely (i) Relevance Classification of Twitter Posts (RCTP), and (ii) Location Extraction from Twitter Texts (LETT). The RCTP subtask aims at differentiating flood-related from non-relevant social posts, while LETT is a Named Entity Recognition (NER) task that aims at extracting location information from the text. For RCTP, we proposed four different solutions based on BERT, RoBERTa, DistilBERT, and ALBERT, obtaining F1-scores of 0.7934, 0.7970, 0.7613, and 0.7924, respectively. For LETT, we used three models, namely BERT, RoBERTa, and DistilBERT, obtaining F1-scores of 0.6256, 0.6744, and 0.6723, respectively.
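As a reminder of what the reported metric measures, here is a minimal sketch of the binary F1-score, the harmonic mean of precision and recall. The labels below are invented for illustration and are not taken from the DisasterMM data; the function name `f1_score_binary` is likewise hypothetical, and the abstract does not state which averaging scheme the authors used.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 for one positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: 1 = flood-related, 0 = not relevant (made-up labels).
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
score = f1_score_binary(y_true, y_pred)
```

Here there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and so is the F1-score.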
This paper addresses smooth function approximation by neural networks (NNs). Mathematical or physical functions can be replaced by NN models through regression. In this study, we obtain NNs that generate highly accurate and highly smooth functions while comprising only a few weight parameters, by discussing several topics in regression. First, we reinterpret the inner workings of NNs for regression; consequently, we propose a new activation function, the integrated sigmoid linear unit (ISLU). Then, special characteristics of metadata for regression, which differ from those of other data such as images or sound, are discussed with a view to improving the performance of neural networks. Finally, a simple hierarchical NN that generates models substituting for mathematical functions is presented, and a new batch concept, "meta-batch", which improves NN performance several-fold, is introduced. The new activation function, the meta-batch method, features of numerical data, meta-augmentation with metaparameters, and an NN structure that generates a compact multi-layer perceptron (MLP) are the essential elements of this study.
We present a novel dataset named HPointLoc, specially designed for exploring the capabilities of visual place recognition in indoor environments and loop detection in simultaneous localization and mapping. The loop detection sub-task is especially relevant when a robot with an on-board RGB-D camera can drive past the same place ("Point") at different angles. The dataset is based on the popular Habitat simulator, in which it is possible to generate photorealistic indoor scenes using both one's own sensor data and open datasets, such as Matterport3D. To study the main stages of solving the place recognition problem on the HPointLoc dataset, we propose a new modular approach named PNTR. It first performs image retrieval with the Patch-NetVLAD method, then extracts keypoints and matches them using R2D2, LoFTR, or SuperPoint with SuperGlue, and finally performs a camera pose optimization step with TEASER++. Such a solution to the place recognition problem has not been previously studied in existing publications. The PNTR approach has shown the best quality metrics on the HPointLoc dataset and has high potential for real use in localization systems for unmanned vehicles. The proposed dataset and framework are publicly available: https://github.com/metra4ok/HPointLoc.
Objective: Despite the numerous studies proposed for audio restoration in the literature, most of them focus on an isolated restoration problem such as denoising or dereverberation, ignoring other artifacts. Moreover, assuming a noisy or reverberant environment with a limited number of fixed signal-to-distortion ratio (SDR) levels is common practice. However, real-world audio is often corrupted by a blend of artifacts such as reverberation, sensor noise, and background audio mixture with varying types, severities, and durations. In this study, we propose a novel approach for blind restoration of real-world audio signals by Operational Generative Adversarial Networks (Op-GANs) with temporal and spectral objective metrics to enhance the quality of the restored audio signal regardless of the type and severity of each artifact corrupting it. Methods: 1D Operational-GANs are used with a generative neuron model optimized for blind restoration of any corrupted audio signal. Results: The proposed approach has been evaluated extensively over the benchmark TIMIT-RAR (speech) and GTZAN-RAR (non-speech) datasets, corrupted with a random blend of artifacts, each with a random severity, to mimic real-world audio signals. Average SDR improvements of over 7.2 dB and 4.9 dB are achieved, respectively, which are substantial when compared with the baseline methods. Significance: This is a pioneering study in blind audio restoration with the unique capability of direct (time-domain) restoration of real-world audio while achieving an unprecedented level of performance for a wide SDR range and a variety of artifact types. Conclusion: 1D Op-GANs can achieve robust and computationally effective real-world audio restoration with significantly improved performance. The source code and the generated real-world audio datasets are shared publicly with the research community in a dedicated GitHub repository.
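For readers unfamiliar with the metric, the sketch below computes a basic signal-to-distortion ratio in decibels and the improvement obtained by a restoration step. It assumes the common definition $\mathrm{SDR} = 10\log_{10}(\|s\|^2 / \|s - \hat{s}\|^2)$; the abstract does not specify which SDR variant the authors used, and the signals and function names here are made up for illustration.

```python
import math


def sdr_db(reference, estimate):
    """Signal-to-distortion ratio in dB: 10*log10(||s||^2 / ||s - s_hat||^2)."""
    signal_power = sum(s * s for s in reference)
    distortion_power = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    if distortion_power == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_power / distortion_power)


def sdr_improvement_db(reference, corrupted, restored):
    """SDR gain of the restored signal over the corrupted input, in dB."""
    return sdr_db(reference, restored) - sdr_db(reference, corrupted)


# Toy signals (not real audio): the restored signal is closer to the reference.
clean = [3.0, 4.0]
corrupted = [3.0, 6.5]   # distortion power 6.25  -> SDR ~ 6.02 dB
restored = [3.0, 4.5]    # distortion power 0.25  -> SDR = 20 dB
gain = sdr_improvement_db(clean, corrupted, restored)
```

In this toy case the clean signal has power 25, so the restored signal reaches 20 dB and the corrupted one about 6.02 dB, giving an improvement of roughly 13.98 dB.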